External Service Integration Tools
This document explains the external service integration tools that connect the application to GitHub repositories, Gmail, Google Calendar, and Google search. It covers the service abstraction layer, authentication mechanisms, API integration patterns, request/response handling, error management, and practical usage examples. It also addresses rate limiting, API quotas, authentication security, and service-specific troubleshooting approaches.
The integration is organized around four primary services:
GitHub crawler and summarization
Gmail operations (listing, fetching, marking read, sending)
Google Calendar operations (listing events, creating events)
Google search via Tavily
Each service is backed by a dedicated service class and thin router endpoints. Tools encapsulate low-level API calls to external providers.
Diagram sources
Section sources
GitHubService: Orchestrates repository ingestion, optional file attachment processing via a generative client, and LLM-driven answer generation.
GmailService: Provides list/unread, fetch latest, mark read, and send operations using access tokens.
CalendarService: Provides list events and create event operations using access tokens.
GoogleSearchService: Executes web search via Tavily and returns structured results.
Each service exposes straightforward methods that wrap tool-level functions and handle exceptions consistently.
Section sources
The system follows a layered architecture:
Routers define HTTP endpoints and validate inputs.
Services encapsulate business logic and orchestrate tool invocations.
Tools implement provider-specific API calls and return normalized results.
Logging and error propagation ensure robust operation.
Diagram sources
GitHub Crawler Utilities#
The GitHub crawler converts a repository URL into a structured representation suitable for LLM consumption. It normalizes URLs, ingests repository metadata and content, and truncates content to fit within token limits.
Key behaviors:
URL normalization to repository root
Asynchronous/synchronous ingestion fallback
Content truncation to prevent context window overflow
Optional attachment processing via a generative client when a file path is provided
Diagram sources
Implementation highlights:
URL normalization prevents invalid paths from being passed to ingestion.
Ingestion supports both async and synchronous modes with graceful fallback.
Content is truncated conservatively to preserve tree and summary while bounding payload size.
Section sources
Gmail Integration Tools#
The Gmail service provides four operations:
List unread messages
Fetch latest messages
Mark a message as read
Send an email
All operations require an OAuth 2.0 access token and use the Gmail REST API.
Diagram sources
Implementation highlights:
Access token is passed via Authorization header for all endpoints.
Requests include timeouts to avoid hanging operations.
Responses are normalized into lists of message metadata or send results.
Section sources
Calendar Management Tools#
Calendar operations support listing upcoming events and creating new events. Both rely on the Google Calendar API and require an access token.
Diagram sources
Implementation highlights:
ISO 8601 string validation for start/end times ensures compatibility with Google Calendar.
Listing events filters by time and orders by start time.
Creating events sets timezone to UTC and returns the created event resource.
Section sources
Google Search Capabilities#
The Google search service integrates with Tavily to perform web searches and return a list of results containing URL and content preview.
Diagram sources
Implementation highlights:
Max results is configurable per request.
Results are normalized to include URL and content preview.
Unexpected response formats are handled gracefully with logging.
Section sources
The service layer depends on tool modules that encapsulate provider-specific logic. Routers depend on services and enforce input validation and error translation to HTTP responses.
Diagram sources
Section sources
GitHub repository ingestion: Content is truncated to a bounded size to fit within model context windows. Large repositories may still exceed limits; prefer targeted questions or smaller subsets.
Gmail operations: Requests include timeouts to prevent long waits. Batch operations should be designed to avoid exceeding provider quotas.
Calendar operations: Listing events filters by current time and orders results; consider reducing max_results for performance.
Google search: Max results can be tuned to balance quality and latency.
[No sources needed since this section provides general guidance]
Common issues and resolutions:
Authentication failures
Ensure the access token is valid and not expired.
Verify scopes for Gmail and Calendar APIs match required permissions.
Confirm the Authorization header is set correctly in requests.
Rate limiting and quota errors
Implement retry with exponential backoff for 429/5xx responses.
Reduce request frequency or batch size.
Monitor provider dashboards for quota usage.
GitHub repository ingestion
Invalid repository URL: Use the repository root (owner/repo).
Private or inaccessible repositories: Ensure public access or proper authentication.
Excessive content size: Ask focused questions or reduce repository scope.
Gmail operations
Empty results: Verify label filters and query parameters.
Message modification failures: Confirm message ID and UNREAD label presence.
Calendar operations
Invalid time format: Ensure ISO 8601 strings for start/end times.
Event creation errors: Validate timezone and required fields.
Google search
Empty results: Adjust query or increase max_results.
Unexpected response format: Inspect tool logs for parsing anomalies.
Section sources
The external service integration layer cleanly separates concerns across routers, services, and tools. It standardizes authentication via access tokens, provides robust error handling, and offers practical workflows for GitHub crawling, Gmail operations, Calendar management, and web search. Following the guidance above will help ensure reliable, secure, and efficient integrations.
[No sources needed since this section summarizes without analyzing specific files]
API Endpoints and Usage Patterns#
GitHub
Endpoint: POST /
Inputs: url, question, chat_history, optional attached_file_path
Behavior: Ingests repository, optionally attaches a file, and generates an answer.
Gmail
/gmail/unread: Lists unread messages with max_results.
/gmail/latest: Fetches latest messages with max_results.
/gmail/mark_read: Marks a message as read using message_id.
/gmail/send: Sends an email to a recipient with subject and body.
Calendar
/calendar/events: Lists upcoming events with max_results.
/calendar/create: Creates an event with summary, start_time, end_time, description.
Google Search
/google_search/: Performs a web search with query and max_results.
Section sources